Grammatical word class variation within the British National Corpus Sampler

Authors

  • Paul Rayson
  • Andrew Wilson
  • Geoffrey Leech
Abstract

This paper examines the relationship between part-of-speech frequencies and text typology in the British National Corpus Sampler. Four pairwise comparisons of part-of-speech frequencies were made: written language vs. spoken language; informative writing vs. imaginative writing; conversational speech vs. ‘task-oriented’ speech; and imaginative writing vs. ‘task-oriented’ speech. The following variation gradient was hypothesized: conversation – task-oriented speech – imaginative writing – informative writing; however, the actual progression was: conversation – imaginative writing – task-oriented speech – informative writing. It thus seems that genre and medium interact in a more complex way than originally hypothesized. However, this conclusion has been made on the basis of broad, pre-existing text types within the BNC, and, in future, the internal structure of these text types may need to be addressed.
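One such pairwise comparison can be sketched as follows. The abstract does not say which statistic the authors used, so this sketch assumes a plain difference in relative frequencies; the tag names and counts are invented for illustration and are not BNC Sampler figures.

```python
from collections import Counter


def relative_frequencies(tag_counts):
    """Convert raw part-of-speech counts into proportions of all tokens."""
    total = sum(tag_counts.values())
    return {tag: count / total for tag, count in tag_counts.items()}


def compare_pos_profiles(counts_a, counts_b):
    """Rank tags by the absolute difference in relative frequency (A minus B).

    Assumption: a simple difference of proportions stands in for whatever
    statistic the paper actually used.
    """
    rel_a = relative_frequencies(counts_a)
    rel_b = relative_frequencies(counts_b)
    tags = sorted(set(rel_a) | set(rel_b))
    diffs = [(tag, rel_a.get(tag, 0.0) - rel_b.get(tag, 0.0)) for tag in tags]
    return sorted(diffs, key=lambda item: abs(item[1]), reverse=True)


# Hypothetical toy counts, NOT taken from the corpus.
spoken = Counter({"NOUN": 150, "VERB": 250, "PRON": 300, "ADJ": 100, "OTHER": 200})
written = Counter({"NOUN": 300, "VERB": 200, "PRON": 100, "ADJ": 150, "OTHER": 250})

ranking = compare_pos_profiles(spoken, written)
for tag, diff in ranking:
    print(f"{tag:6s} {diff:+.3f}")
```

A positive value means the tag is relatively more frequent in the first text type. Repeating the comparison for each of the four pairs would give the kind of evidence needed to place the text types along a gradient.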

Similar resources

Unsupervised All-words Word Sense Disambiguation with Grammatical Dependencies

We present experiments that analyze the necessity of using a highly interconnected word/sense graph for unsupervised all-words word sense disambiguation. We show that allowing only grammatically related words to influence each other’s senses leads to disambiguation results on a par with the best graph-based systems, while greatly reducing the computation load. We also compare two methods for com...

Using the BNC to produce

This paper describes an attempt to generate seemingly meaningful cryptic crossword clues without trying to analyse meaning but relying solely on word occurrence statistics. It is a continuation of a project in which I developed an application toolkit for cryptic crossword clue compilers. The software described here assembles simple cryptic clues using the resources developed in the earlier proj...

Investigating the collocational behaviour of MAN and WOMAN in the BNC using Sketch Engine

In this paper, I examine the representation of men and women in the British National Corpus (BNC) by focussing on the collocational and grammatical behaviour of the noun lemmas MAN and WOMAN (i.e., the nouns man/men and woman/women). Using Sketch Engine (a powerful corpus query tool, which is described) I explore the functional distribution of the target lemmas, and reveal the structured and sy...

Linggle Knows: A Search Engine Tells How People Write

This paper presents Linggle Knows, an English grammar and linguistic search engine. Linggle Knows helps people write by displaying lexical and grammatical information extracted from several large-scale corpora, including Google Web 1T 5-gram, British National Corpus (BNC), New York Times Annotated Corpus (NYT), etc. It not only describes how a word is genuinely used, but also recommends va...

CLAWS4: The Tagging of the British National Corpus

The main purpose of this paper is to describe the CLAWS4 general-purpose grammatical tagger, used for the tagging of the 100-million-word British National Corpus, of which c. 70 million words have been tagged at the time of writing (April 1994). We will emphasise the goals of (a) general-purpose adaptability, (b) incorporation of linguistic knowledge to improve quality and consistency, and (c) ...

Publication date: 2001